(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 
Iniemaiional Bureau 

(43) International Publication Date 
27 November 2003 (27.1 1.2003) 




PCT 



(10) International Publication Number 

WO 03/098479 A2 



|(51) International Patent Classification': G06F 17/30 
i(21) International Application Number: PCTAJS03/ 14892 
(22) International Filing Date: 8 May 2003 (08.05.2003) 

(25) Filing Language: English 

(26) Publication Language: English 

(30) Priority Data: 

60/378,0 18 1 0 May 2002 ( 1 0.O5 .2002) US 

10/254383 24 Sepiember 2002 (24.09.2002) US 

(71) Applicant: ORACLE INTERNATIONAL CORPORA- 
TION [US/USl; 500 Oracle Parkway, Redwood Shores, 
CA 94065 (US). 

(72) Inventors: YALAMANCHl, Aravind; 8 Louishurg 
Square, Apt. 9, Nashua, NT-I 03060 (US). GAWLICK, 
Dieter; 757 Paul Avenue; Palo Alio, CA 94306 (US). 



SRINIVASAN, Jagannathan; 1 Hampshire Drive, Apt. 
F, Nashua, NH 03062 (US). 

(74) Agents: HENKHAUS, John et al.; HICKMAN 
PALERMO TRUONG & BECKER LLP, 1600 Wil- 
low Street, San Jose, CA 95125 (US). 

(81) Designated States (national): AE, AG, AL, AM, AT, AU, 

AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, 
CZ, DE, DK, DM, DZ, EC, EE, ES, R, GB, GD, GE, GH, 
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, 
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MM, MW, 
MX, MZ, Nl, NO, NZ, OM, PH, PL, PT, RO, RU, SC, SD, 
SE, SG, SK, SL, TJ, TM, TN, TR, TT, TZ, UA, UG, UZ, 
VC, VN, YU, ZA, ZM, ZW. 

(84) Designated States (regional): ARIPO paieni (GH, GM, 
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW), 
Eurasian patent (AM,' AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European patent (AT. BE, BG, CH, CY, CZ, DE, DK, EE, 

[Continued on next page] 



(54) Title: MANAGING EXPRESSIONS IN A DATABASE SYSTEM 



< 
OS 

00 

ON 

o 



O 



RECEIVE A FIRST QUERY THAT INCLUDES A 
HRST CONDITIONAL EXPRESSION 

202 


y 




REPRESENT THE FIRST CONDITIONAL 
EXPRESSION AS DATA IN A COLUMN OF A TABLE 

204 




f 


CREATE AN INDEX ON THE COLUMN 
206 


y 


f 


RECEIVE A SECOND QUERY THAT SPECIFIES A 
FIRST SET OF ONE OR MORE CRITERIA 
2QS 


y 


f 



EXECUTE THE SECOND QUERY TO SELECT DATA 
BASED AT LEAST ON WHETHER CONDITIONAL 
EXPRESSIONS IN THE COLUMN SATISFY THE 
FIRST SET OF CRITERIA 
21II 



(57) Abstract: Managing expressions includes 
receiving a firsl query thai includes a condilional 
expression. The expression is then represented 
as data in a column of a table. A second query 
is received that specifies a firsl set of criteria, 
and the second query is executed to select data 
based al leasi on whether expressions in the 
column satisfy the firsl set of criteria. In an 
embodiment, the second query further specifies 
a second sel of criteria, wherein executing the 
second query includes selecting data based 
on whether data in columns other than the 
expression column saiisfy the second criteria. A 
special index is defined, which can be created on 
the column thai stores the expressions, lo filler 
large sets of expressions efficiently. A method 
of evaluating an expression set stored as data 
in a table classifies each predicate from each 
expression, and filters the expression set based 
on the predicate classification. 



wo 03/098479 A2 



iiiiltilllifl 



ES, H. FR, GB, GR, HU, IE IT. ^ UC, NL, FT, RO 
«;E si SK. TR), OAPl patent (BF, BJ, CF, CO, (-1, cm, 
GA GN GQ, gW. ml. MR, NE, SN, TD, TG). 

uport receipt of thai report 



^.c nnd Other abbreviations, refer to the "Quid- 
ning of each regular issue of the PCT Oazette. 



BNSDOCID <WO____030BM79A2J.» 



wo 03/098479 



PCT/US()3/14892 



MANAGING EXPRESSIONS IN A DATABASE SYSTEM 

CROSS REFERENCE TO RELATED APPLICATIONS 
[0001] This application is related to and claims the benefit of priority from U.S. 
Provisional Patent Application No. 60/378,018 entitled 'Managing Expressions In A 
Database System/' filed on May 1 0, 2002, which is incorporated by reference in its 
entirety for all purposes, as if fully set forth herein. 

FIELD OF THE INVENTION 
[0002] The present invention relates generally to database management systems and, 
more specifically, to techniques for managing conditional expressions in database 
systems. 

BACKGROUND OF THE INVENTION 
[0003] In the context of event and content-based subscription systems, events are 
defined which, when met, trigger an action. For example, a subscriber can define rules 
that include events that define a state of content that, when met, trigger transmission of 
content to the subscriber. Using a database management system as an underlymg engine 
for an evaat-based subscription system, a subscriber can register queries with the system 
that represent conditional expressions on the content of the events. In such a subscription 
or similarly functioning system, a potentially very large set of queries, i.e., an expression 
set on the content, are registered to manage the publication of desired content data. When 
a given data item becomes available, these conditional expressions are filtered to find 
those expressions that match the given data item. 

[0004] A simple but inefficient approach to the taslc of filtering expression sets is to 
test all of the expressions in a given set for each data item. However, this approach is 
scalable neither for a large set of expressions nor for a high rate of events. Therefore, 
most commercial systems pre-process the expression set and create in-memory matching 
networks (i.e., specialized data structures) that group matching predicates in the 
expression set and share the processing cost across multiple expressions. 
[0005] Matching networks are decision trees in which each node represents a 
predicate group in a given expression set. Data flows fi-om a parent node to its children 
only if the data evaluates to tme for the predicate representing the parent node. A path 
from the root of the decision tree to a leaf node represents all the conjimctions in an 
expression. The leaf nodes in the tree are labeled with expression identifiers and if a data 
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expressioiis and related daB. aesirable to provide an improved 

,„„„„ Based on the foregomg, tt . ol«^ P ^ ^^^^^^^^^ 

^eoV^nis. tor n«na^g -pressrons. ^J^^^^^ ^ P-*- 
system, in addition, ^.creisarnorespectfio^--^^^^ ^^^^^^^ 

ability to filter expression, in conjunction mth filters 

,0„0S, -presentinventioni^nastr^^yo.— 
linntation, to the figures of a>e accompanymg dravnngs and m 
numerals refer to similar elements and in which: ^ 
,00091 FIG. 1 is anexampletable>«dtosuppottexampl=sofpr 

described herein; manaring expressions in a 

[00101 nG.2isaflowchartthatiUustratesaprocessformanagmg P 

database; - evaluatins aa expression 

loom FIG. 3 A is a flowchart that iUustrates a process for evaluating 

set that is stored as data in a colunm of a table; ^xoressions based on 

100121 FIG. 3B is a flowchart that illustrates a step of filtenng expressions 

predicate classifications; and .„™nnter svstem upon which an 

[00131 FIG. 4 is a block diagram that illustrates a computer system up 
embodiment of the invention may be implemented. 

DETAttiaDESCRSTIQH 
,00X4, Arriethod^dsys.=n.aredescrihedformana«in.express^^jad^- 

U --n.andmore.ec.c.^—-^^^ 
managing conditional expressions associated with event 
subscription systems. 



-2- 



<VtfO __0309B479A2_I_> 



wo 03/098479 



PCTAJS03/14892 



[0015] In the following description, for the purposes of explanation, numerous 
specific details are set forth in order to provide a thorough understanding of the present 
invention. It will be apparent, however, that the present invention may be practiced 
without these specific details, hi other instances, well-known structures and devices are 
shown in block diagram form in order to avoid unnecessarily obscuring the present 
invention. 

OVERVIEW 

[001 6] Conditional expressions, often in the forai of standard database query, are 
represented as data in a column of a table. These expressions may represent, for example, 
data filters for filtering data in an information subscription system. Anotlier standard 
database query, which specifies criteria, can then be executed on the colmnn to determine 
whether expressions in the column meet the specified criteria. The criteria may represent, 
for example, incoming data to a subscription system. 

[00171 Thus, the expression processing mechanism described herein is integrated with 
database technology by treating the expressions as data that can be queried along with 
other related user-specified data. For example, a pubUsher may mclude in the query other 
filtering criteria related to its subscribers. 

[0018] In an embodiment, a first queiy is received that includes a first conditional 
e^qjression. For example, the first query may be received via INSERT or database load 
operations. The first expression is then represented as data in a colunm of a table. A 
second query is received that specifies a first set of criteria, and the second query is 
executed to select data based at least on whether expressions in the column satisfy the 
first set of criteria. For example, the second query may be received firom a iiser of a 
database apphcation. 

[00191 I^ ^ embodiment, the second query fiarther specifies a second set of criteria, 
wherein executing the second query includes selecting data based on whether data in 
columns other than the column satisfy the second criteria. 

[00201 Other embodiments include receiving the first query from a subscribe: to an 
subscription ser\dce or system, wherein the first expression specifies criteria that must be 
satisfied by data for the data to be of interest to the subscriber; and receiving the second 
query from a publisher associated with the subscription sj^tem, wherein the second query 
specifies criteria that must be satisfied by data, with respect to the subscriber, for the data 
to be published to the subscriber. Thus, mutual filtering can be performed on data by 
both subscribers and pubUshers. 
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PRE-PROCESSING A SET OF EXPRESSIONS 
[0027] Given a large number of conditional expressions in a system, the expressions 
tend to have similarities at the elementary predicates level. That is, two expressions, each 
with one or more elementary predicates joined by conjunctions, may have a common 
predicate. Thus, expression evaluation costs are shared among multiple expressions, 
thereby leveraging the foregoing similarities. In an embodiment, a library of functions, 
the Expression Type Library, supports the pre-processing of a set of expressions. 
[0028] The Expression Type Library provides the basic functionaUty required for 
building a matching network for expressions. This Ubrary is supplied as a set of Java 
classes which are capable of converting an expression string into an expression tree 
containing elementary predicates, joined by conjunctions and disjunctions. This library 
can be used by any indexiag scheme for pre-processing the expressions ia an expression 
set and during incremental modifications to these expressions. 
[00291 This library parses the expressions and processes them as follows : 
[0030] (1) Normalize the expression, for example, by rearranging the predicates in an 
expression to rewrite it in a disjunctive normal form (DNF), that is, an OR Ust of AND 
sub-clauses. For example, an expression of form 

SYMBOL = 'GE' and (PRICE < 25 or PRICE > 35) 
is rewritten as follows after a DNF conversion: 

(SYMBOL = 'GE' and PRICE < 25) or (SYMBOL = 'GE' and PRICE > 35). 
[0031] (2) Normalize the predicates, by rewriting each predicate in an expression 
such that it has a pure constant on the right-hand side. For example, a predicate of fonn 

PRICE > 27 + CHANGE 
is rewritten as: 

PRICE - CHANGE > 27 
[0032] (3) Decode the predicate, by resolving it as follows: 

left-hand side (LHS): an expression or a function of one or more attributes 
(e.g., PRICE - CHANGE from the predicate PRICE - CHANGE > 27); 

right-hand side (RHS): a constant (e.g., the "27" from the predicate PRICE - 
CHANGE > 27); and 

relational operator: the relational operator joining the LHS and RHS (e.g., the 
">"from the predicate PRICE - CHANGE > 27). 

[0033] The left-hand side of a predicate, for example, the attribute SYMBOL, is also 
referred to as a complex attribute. It could also be a sub-expression involving one or 
more elementary attributes or user-dejBned functions, for example, CHANGE/PRICE. 
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predicate constants (RHb) are stOTBu y .„r,ctructs such as IN lists, sub- 

detailbelo.. In an W-tation, predicates mvolvmg constructs such 

queries, etc. are not grouped with other predicates. 

ATTRIBUTE SET 

s^uctuxe or context U.^ated. ^'^^^^'^ ^,^^'^1^^^ settes^Wes an 
3n expression set, and ^c. da., types. 1> tts snnplest " ^ 
abstra^ttypedeMtionin^edat^-e. Foiexatnple, aset of deme tary 
in an expression set constitutes its attribute set ^^^^^^j, expressed in the 

1O03S1 one non-lnniting technitpe for crealmg an attnbute set exp 

following commands: 

EXECUTE dbms_6xpeng.create_attribute_set ( 

attr_set=>'TlCK'); 
EXECUTE dbms_expeng.add_elementary_attribute ( 
attr_set =>'T1CK', 
attr_name=> 'SYMBOL', 
attr_type => 'VARCHAR2(6)'); 
EXECUTE dbms_expeng.add_elementary_attribute ( 
attr_set =>'TICK'. 
attr_name => 'PRICE', 
attr_type=> 'NUMBER'); 
EXECUTE dbms_expeng.add_elementary_attribute ( 
attr_set =>'TICK', 
attr_name=> 'CHANGE', 
attr_type=> 'NUMBER'); 

.bereby an at^ibnte set "TICK" is created, having elenaentaty at«ibutes "SYMBOI.". 
"PRICE", and "CHANGE". nsstoredinacolunmof a table. For example. 

expressions can be stored m a ° ^ ^^y, example, 
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the expression set is associated with the attribute set created for the expression set. One 
non-limiting technique for associating an attribute set with an expression set is expressed 
in the foUowing commands: 

EXECUTE dbins_expeng.assign_attribute_set ( 

attr_set => 'TICK', 

tab_name => 'TRADER', 

exp_column => 'INTEREST'); 
whereby the attribute set "TICK" is associated with an expression set stored in column 
'TNTTEREST" of table "TRADER". 

[0037] A VARCHAR2 or a CLOB column associated with an attribute set constitutes 
an EXPRESSION column. The values stored in an EXPRESSION column are treated as 
expressions and they are initially expected to adhere to SQL-WHERE clause format that 
can include XPATH expressions. These expressions can use all the attributes defined in 
the attribute set along with any system variables and user-defined functions that are valid 
in the user environment, for example, 

UPPER (symbol) = TNTC* AND change/price > 0. 1 . 

EXPRESSION FILTER 
[0038] According to an embodiment, an Expression Filter is a set of PL/SQL 
packages arid APIs to manage expressions in user tables, and to filter the expressions for a 
given data item, that is, to match criteria expressed in expressions with the. given data 
item, using a standard SQL or other query language query. In a publication system, the 
expressions specify criteria that must be satisfied by data for the data to be of interest to a 
subscriber. The Expression Filter comprises two components: an EVALUATE operator 
(described immediately below) and an Expression Filter Indextype (described under the 
heading "Creating An Index For The Expression Set")- 

EVALUATE OPERATOR 
[0039] A new operator is introduced that processes the expression set stored in an 
EXPRESSION colxmm. This operator can be used in the WHERE clause of a standard 
SQL, or a DML statement, to filter the expressions for a data item. The predicate on the 
expression set, using the new operator EV.AJLUATE, can be combined with any other 
predicate on the table. The EVALUATE operator accepts the name of the column storing 
the expressions, e.g., INTEREST, and a given data item as arguments. The EV.\LUATE 
operator intemally uses the expression set metadata to evaluate expressions for data items 
passed in. 
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10,^, M«a^pleofaque.y«-*=BVAI.UATBop»»r.a.fonows: 

SELECT * FROM traders 
^VHERE EVALUATE (traders.exp, 

■symbol=>"mtc", price=>32, chaiige=>3.3') = 1 

TV,<» pvnression is considered true It tue query cvaiu , ^ j 

Zin>uW-il=jotos.ndanyotad«abase query opera^ons^ng GROUP 
™ BY oU«.e HAVWQ cla>« In addition, filt^g a set ot 

expressions for a baton 01 A^exampleofsuchaquery is as , 

witbthetablestormgthedataitemsiscontemplated. Anexample 

follows: . FROM traders, dataitems 

SELECT distinct (dataitems.symbol), couiit( ) FKUM xrau 

WHERE EVALUATE (traders.exp. 

tick.getVarchar(dataitems.symbol, 

dataitems.price, 
dataitems.change)) = 1 

AND traders.city = "New York' 

GROUP BY dataitems-symbol; 
. TTCKis the nameofthe attribute set defined for the expression set, TRADERS isa 

where TICK IS the name oi ^^3^1e), andDATAIIEMS is a 

table that stores the expression set (see FIG. 1 for an ex p , .he exoression set. 
table that stores the dataitenisbeingprocessed,i.e., being coxuparedto the expiessions 

CREATING AN INDEX FOR THE EXPRESSION SET 
ro04^1 Testinoeveryexpressionforadataitemisalineartimesolution Whena 
[00421 Testm^e ry ^ ^^^^^ scalable for a high volume of 

a large sex oi CAP ^ ctrino This index can be defined on an 

..p«ssio„ se. tor a given da««™ -^^^^^J^^^^,^„f^,^ae. £or«>e 

evaluation of an expression set, based on the index usage 

,^Kiects Additionally, the information stored m these objects mam 
r^srle.prlon.etn.n.O»^operation.on«re.a.,es.,in,..ee^e..ons^ 
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EXPRESSION FILTER MDEX TYPE 
[0043] In an embodiment, the indexing scheme is implemented as a new Indextype, 
Expression Filter, using an extensible indexing framework. In an implementation, the 
indexing scheme is implemented as a new Indextype, Expression Filter, using the Oracle 
Extensible Indexing framework. The Expression Filter index type can be used to create 
an index on any set of expressions stored in a database colmnn of type VARCHAR2, 
CLOB or BFILE. However, use of another index type other than the foregoing, which 
may be used on expressions stored as data types, is contemplated and therefore within the 
scope of embodiments of the invention. 

[0044] The Expression Filter index type implementation parses a set of expressions 
and groups the predicates in the expressions into disjoint sets with matching Left-Hand- 
Sides. The data structures used to group the predicates in an expression set are relational 
in nature. In an implementation in which persistent database objects are used for the 
Expression Filter index data stracture, an example of such database objects are as follows: 

Predicate Table: a relational table that stores the predicates appearing in the 
expressions; 

Bitmap Indexes: one or more bitmap indexes on the predicate table; and 
Access Function: a function that queries the predicate table to filter the 
expressions for a data item. 

[0045] These objects collectively provide for efficient filtering of expressions based 
on both equality and range predicates. Furthermore, since the index stmcture objects are 
persistently stored in the database, memory constraints associated with the size of 
expression sets in prior approaches, which typically use main memory extensively, are 
not apphcable to the present embodiments. By contrast, operations using the present 
embodiments can store the necessary database blocks into a database buffer cache as ih&y 
are needed. 

[0046] As described above, expressions from a given set of expressions refer to a set 
of elementary attributes with fixed data types. Further, a set of vaUd values for these 
attributes constitute a data item, which is evaluated against these expressions. Hence, to 
index a set of expressions with the Expression Filter index type, all the elementary 
attributes used in the expression set should be associated with the database column 
storing the expressions. These elementary attributes, along with some optional complex 
attributes, constitute the attribute set for the expression set, which are stored in one or 
more data dictionary tables. One non-limiting technique for creating an Expression Filter 
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(3) Sparse predicates: the predicates belonging to this set are stored in their 
original form. If more than one sparse predicate exists for an expression, they are 
combined into one conjunction. Note that it is not always, efficient to add a new set of 
columns to the predicate table for every predicate that cannot be grouped with others. 
Hence, according to an embodiment, a separate VARCHAR2 column, 
PARTIAL^CONDITION, is defined in the predicate table to hold the conditional 
expression for sparse predicates. Unlilce other columns in the predicate table, this column 
is not indexed and it can hold one or more predicate definitions in conjunctive form. 
[0051] A predicate falls into one of the above sets based on the cost of computing its 
left-hand side and the frequaicy of occurrence of its left-hand side in the expression set. 
The evaluation cost for a predicate depends on the set it belongs to. 
[0052] Steps involved in evaluating a predicate, with respect to its classification as 
described above, are as follows. 

(1) Indexed attribute: a one-time computation of the complex attribute (i.e., 
LHS of the predicate group), and one or more range scans on the bitmap indexes using 
the computed value; 

(2) Stored attribute: a one-time computation of tiie complex attribute, and 
comparison of the computed value with the RHS of all the predicates in this group; and 

(3) Sparse predicate: parsing of the sub-expression representing the sparse 
predicate, and evaluation of the sub-expression through substitution of data values. 
[0053] During the expression set evaluation, according to an embodiment, the 
expressions are filtered in three phases, as follows. 

[0054] Phase 1 : The predicates belonging to the Indexed attribute set are tested by 
performing a few range scans on the bitmap indexes defined thereon. The results from 
these scans are combined (bitmap AND operation) to obtain a set of expressions for 
which these predicates are all true for the given data item. 
[0055] The total cost of this phase is defined as 

I * CU +N1 * C12 * log(El) + C13 * E2; 

where 

lis the number of bitmap indexes used for filtering; 
El is the number of expressions to be filtered; 
CJl is the average bitmap index processing cost per index; 
C72 * log(El) is the average cost of one index lookup; 
CI 3 is the cost of fetching one row fi-om the predicate table using the row 
identifier; 
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100561 TheresultofPhaselisasetofexpressionsthatevaluate 

p.edicatesbelon^g to ^e indexed attrib^es^^^^^ 
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^ wooing .e. is — d aown — ly in ^^^^^ ^ 
(00591 TheresultofPhase2isas«ot«P«ssio=sfhat«raluate 

prolicates belonging to the indexed or sWed a«ribn« se^ 

I0O6OI Pbase 3: For all tt.e expressions tot are ttue after Phase 2. the sp 

predicates (if any) tor these expressions are evaluated. 

[00611 The total cost of this phase is defined as follows 



p * C3 * E3; 
where 



where . . 

Pis the probability of a sparse predicate for an expression, 

rilLveragecostofparsingandevaluatingasparsepredicate^^^ 

S^^otalLber of expressions thataretrueafterPhase2offil.nng. 

.00621 iLretlTo Phase 3 is a set of expressions that evaluateto true for the given 
IL r::talternativeaccessplansotherthantheindexWevalua.on 

described above can be used. ^^^^g ^ juustrated in FIG. 1, which stores as 

[00631 "^^-^^^^^^'^^^^rj^andtheixinterestin^^^ In 
conditional expressions h^orxnation about stock traders an ^ork, and 

^ A 11 tv,. Ti? ADERS who are under 30 years of age, hvmg in Mew i 
order to find all the TRADERS^^ ho CHANGE=5.2), the following 

interested in a data item (SYMBOL^ ORCL , ^^^^^^ 

query is an exampleofaquery that canbe issued on^e TRADERS table. 

SELECT name, phone FROM traders 
WHERE EVALUATE (interest, 

>S.^OL='ORCL', PRICE=>31, CHANGE=>5.2') = 1 AND 



= .^^ew York' AND age < 30; 
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[0064] In practice, this query could make use of the predicate table and the bitmap 
indexes on the predicate table to filter the expressions. The Expression Filter index 
returns the rowids for all the expressions that evaluate to true \vith the given data item. 
Remaining predicates in the SELECT statement are evaluated on the corresponding rows 
to answer the query. In addition to the Expression Filter index on the INTEREST 
column, if the TRADERS table has native or extensible indexes defined on the CITY and 
AGE colimins, an optimizer program may choose one or more of these indexes, based on 
the access cost, to answer the above query. 

[0065] When a new row is inserted into the TRADERS table or some of the existing 
expressions are updated, the Expression Filter index is automatically maintained to reflect 
these changes. 

[0066] As the foregoing example query illustrates, the techniques described herein 
integrate expression filtering operations into database operations. In the context of an 
infonnation subscription-publication system, as a result of the integration of expressions 
and filtering mechanisms into a database system, mutual filtering from both a subscriber 
and a publisher can be performed, and performed efficiently', throu^ use of standard 
SQL statements. Therefore, multi-domain queries are possible, by joining tables and 
adding predicates to a query to furfher filter expressions. Furthermore, batch evaluation 
of expression sets for a given set of data is possible by joining the table storing the 
expressions with the table storing the data items. 

[0067] Prior approaches separate characteristics of system users or cUents, such as 
name, telephone number and residence, from their respective interests in data, which is 
expressed in subscription expressions. Many applications can benefit from this 
integration of interests and personal characteristics. For example, a query could be 
executed to retum "aU traders in New York with an interest in Stock X exceeding a price 
of Y" or "aU customers located within 10 miles of my store with an interest in vehicle Z." 
Significantly, present embodiments facilitate optimal filtering of expressions based on 
their context. For example, in the context of stock trading, this context leads to efficient 
filtering of data by grouping predicates on the SYMBOL attribute, since the vast majority' 
of users will have interests relative to a specific stock represented by a stock symbol, and 
by indexing operators and constants associated with predicates that include the SYMBOL 
attribute. 
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PROCESSFORMANAGINGEXPEESSIONSINADATABASE 

1,06,, no. 2 is a fiowcharttatmu^ra^s aprocss for paging » ^ 

da«b«. no. 2 isdescribcd^itoaddittonalrcfercaceto thetable ofHG. 1. 

rion. Fo.«xan^le.theW,ucry™yber»eiv,dvUmSERToraatabas ..ad 

;r»os. an ^^o^^ oondittonal exprcssiona a« expec.=d to be u> .be fonn of a 
rwHBRBCaus. F^. any SQL WHERB clana. caa be „.a.ed as an exp«.^^^^ 

>n a«e,»adve enibodtae... condiSonal expressions having a form a SQL 

and having a fOmr o«>er aan an, of query, are —lated. For ex^^. 

Lly specify intcrcs. in dara via siraple .cx^al inpu. whrch rs converTed *re^^ 
into M ]^ Jropxiate data forma, such as V ARCHAR2 or CU,B. for s«.rmg Hr a oolunan 

of a table. , . 

[00701 lnanembodim=n.,<hefirs.<peryisre«ivedftomasubscnbertt,an 

Lion subsoription system, or oth« even, or con.en.-bas»l infcrrn^ron ^bUc^on 
sy^enrwhereinare expression specifies cri.eriama.mus.beme.byda«>forfte da.a» 

be of ^.e^s, .0 a particular subscriber, -n-us. me firs, query can serve as a mech^ 
for a firs, leve. of filtering wifi, ,esp«. to system data, essentially processmg da,a for 
disoatth or publioatton to appropriate subscribers. 

lZ» Atbloclc204,*efirstcoudifionalexpressionof,Jrefirs.,ue,y.ex^^^as 
data ^ a column of a .able, as de^ibed above and as depiced as column l 
For example, one or more predicates oonstimen. «, fire expression are -«>|"* »^ 
tonua. on which SQL or anofirer query language can ope.a«, and on whrch a da^ 
manasemen, system canmanagcNotefiratflrerepresen^rion of expressronsmto 

MEREST column 106 of the table of FIG. 1 is for purposes of explanafou. and do no. 
ra^e^eformasdepicted Kather, the expressions stored in tire tf^T column, or 
any sinular column in which expression representations are s.ore4 are aomally 
conveahonally encoded iuto an appropriatt data fenua. „m. in which 

[00721 At optional block 206, an urdex is ^ated on ^e column of the table m wtach 
L c^aditional expressions are stored as data a. block 204. hr fins embodiment an mdox 

is created as described above. ^ -u >i ^„ 

,00731 Indexing steps include grouping predicates fiom a set of expres^ons, based on 

predicate identifiers that are assocated wiU- respective cri»ria of tire ^J^^'^^ 
Continuingwi*theexample.pred.ca.egroupidentifiersmightincludeSYMBOL. 

PWCE and CHANGE. SeB of predrcate operators and constants are stored m a predrcate 
tablei.; association wifir respective predicate identifiers. For examplcapredrcate 
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grouping for SYMBOL may have entries in the predicate table representing sets of 
operator and constant combinations, such as operator is "equal to" and constant is 
"ORCL", and a predicate grouping for PRICE may have entries in the predicate table that 
include operator and constant combinations such as "greater than or equal to" and "31". 
These sets of operators and constants can be encoded in any conventional manner 
appropriate for their purpose. Next, a concatenated bitmap index, or another form of 
index, is created based on a set of predicate operators and constants associated with a 
particular predicate identifier. For example, a set of operator and constant combinations, 
represented in respective columns of a predicate table, for the predicate identifier 
SYMBOL, is indexed for fast and efficient evaluation of predicates and thus, expressions, 
as part of a data filtering process. 

[0074] At block 208, a second query is received that specifies a first set of one or 
more criteria. The following query is an example of a second query. 
SELECT name, phone FROM traders 
WHERE EVALUATE (interest, 

•SYMBOL='ORCL', PRICE=>31, CHANGE=>5-2) = 1 AND 
city = 'New York' AND age < 30; 
This query specifies the criteria, SYMBOL is equal to "ORCL", PRICE is greater than or 
equal to 3 1 , and the CHANGE in PRICE is greater than or equal to 5 .2. The second 
query may be received, for example, firom a user of a database application. 
[0075] In an embodiment, the second query is received fi*om a publisher in an 
information subscription system, or other event or content-based information pubUcation 
system, wherein the expression specifies a set of criteria that must be met by data for the 
data to be published to a particular subscriber. Thus, the second query can serve as a 
mechanism for a second level of filtering with respect to system data, essentially 
processing data for dispatch or publication to appropriate subscribers. 
[00761 At block 210, the second query is executed to select data based at least on 
whether conditional expressions in column 106 satisfy the first set of criteria. For 
example, execution of the second query determines, among other things, whether a 
particular data item meets a condition or set of conditions as expressed in a stored 
expression in the data column 106. In other words, in the context of a subscription 
system, it is determined whether there are any subscribers that are interested, through the 
conditions or predicates specified in their expressions, in data meeting the criteria 
specified in the first part of the WHERE clause (before the AND conjimction) of the 
query. 
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■OOTTl lnanembodim«.t.fi»sccandqu«y&*=rspecifi.sasecond.e.o,^^ 

Lg nLe««n one .eve. or direcdon of fil.^^. For «c.^le. « ,ue,v 

rve™.bo^subscnW»dp*li*---^m»inS*-'--- 
rprldesc..edc.o^ra«i.a.Wonn^oo,eo„..yreU..di^.— 

L.d set Of crt^ia ^cpresstog security or a««rori.a«oo ori»« regardu.g » v,ho. 

particular mfoimation can or should be published or provided 

PROCESS FOR EVALUATING AN EXPRESSION STORED AS DATA IN A 
COLUMN OF A DATABASE TABLE 
,0079, HG.3Aisaflowchart,hatillus.ra.esap«cesstoreval„atirrgane.press.on 

set that is stored as data in a column of a table. 

tm A.Hoolc30Z.eachpredicateformeache.pressio«of.heexpress.ou«.. 
IsiLas one of an indexed aUnbute predicate, a stored ^tribute pred-cate anda 
tarse predicate. An indexed attribute predicate is a predica. that. s chosen. o be 
Ilxel as described above, wherein the index is based on a set of opera«. and cot^t 
^1 tba. ^ stored in respective coh^ of a predicate table in assoca^on ™th a 

r„:r:t: — Ptedicateisapredicatef^wMcbitso^andoo^. 

lbu.es are stored in respective colunn. of the P™^--'*'' — 
respectivcide„ffier,butforwhichnoindexiscrea.ed F„„xample, stored at«b«e 
;i:tes.ni^.no.beindexedbecause their id.tifiersarenotco.runonlyus^.0^ 

L d,e expression se, or *ey nu^t be complex pre^ca»s compnse operafons 
basic attributes (e.g., CHANQEffRICE). 

(0082, Asparsepredicatcasdescribedabovcisapredioatemwhtchapr^ca^ 
LL associated ^erewithisunconnnonintheexpres^onset For examplcasparse 
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predicate might be derived from a predicate with a BETWEEN operator, with multiple 
operators on the same attribute, with functions, and with a LIKE operator. A sparse 
predicate, and its associated operator and constant, is stored as data in the predicate table. 
[0083] At block 304, the expression set is filtered based on the classification of 
predicates performed at block 302. FIG. 3B is a flowchart that illustrates the step of 
filtering expressions based on predicate classifications, block 304. 
[0084] At block 304A, the indexed attribute predicates are filtered first to obtain a 
first set of expressions that includes one or more expressions for which all of its indexed 
attribute predicates are true for a given data item. Second, at block 304B, the stored 
attribute predicates of the first set of expressions are filtered to obtain a second set of 
expressions that include one or more expressions for which all of it stored attributes are 
true for the given data item. Third, at block 304C, the sparse predicates of the second set 
of expressions, if any, axe filtered to obtain a third set of expressions that includes one or 
more expressions for which all of its predicates are true for the given data item. Hence, 
the data item meets the criteria specified in the expression. 

HARDWARE OVERVffiW 
[0085) FIG. 4 is a block diagram that illustrates a computer system 400 upon which 
an embodiment of the invention may be implemented. Computer system 400 includes a 
bus 402 or other communication mechanism for communicating information, and a 
processor 404 coupled with bus 402 for processing information. Computer system 400 
also includes a main memorj' 406, such as a random access memory (RAM) or other 
dynamic storage device, coupled to bus 402 for storing infomiation and instructions to be 
executed by processor 404. Main memory 406 also may be used for storing temporary 
variables or other intermediate information during execution of instructions to be 
executed by processor 404. Computer system 400 further includes a read only memory' 
(ROM) 408 or other static storage device coupled to bus 402 for storing static information 
and instructions for processor 404. A storage device 410, such as a magnetic disk, optical 
disk, or magneto-optical disk, is provided and coupled to bus 402 for storing information 
and instmctions. 

[0086] Computer system 400 may be coupled via bus 402 to a display 412, such as a 
cathode ray tube (CRT) or a liquid crystal display (LCD), for displaying information to a 
computer user. An input device 414, including alphaniuneric and other keys, is coupled 
to bus 402 for communicating information and command selections to processor 404. 
Another type of user input device is cursor control 416, such as a mouse, a trackball, or 

-17- 



BNSOOCID: <V/0 030g8479A2J_> 



PCT/US03/14892 

WO 03/098479 

tt,nroc«sor404andfetcontrollingcursoimovtinaitond.splay41.. P 
topropKsoi™»«" ■ , and a second axis 

typicaUy has two degrees of toedom m wo axes, a first axis (e.g., x) an 
(ee y) that allws the device to spec*' positions in a plane. 

T.einven.ionistela.ed.o.heuseofcon*u.e.syste.400for^p™ 
rtLquesdescribedherein According to one embodiment of the invenUox. those 

s« P-rf-ed by coniputet syste. in response .0 processoi .04 e^^^^ 

^.ore ^ of one or more instructions contained in main memory 406. SucH 
i^onsmaybereadintomainmemory 406 ftom another computer-readable 

^«Uum. such as storage device 410. Execution of the sequences of instrucUons 
contained in main memory 406 causes prt^essor 404 to perform the process steps 
rescribedherein.toalteniaUveembodim».s,hard..iredcircuitrymaybeused,n^ 

oformcombinadonwithsoftwareinstruodonstoimplementdiemvention. Th,., 

embodiments of the invention are not hmited to any specific combmahon of hardware 

circuitrv and software. _ . 

■00887 Thete.m"computer-readable»edi«m"asusedhereinreferstoanymed.mn 

rpLcipatesinprovidingins.ructions.opmcesaor404forexec«ionSu^^ 

„ayl.e many forms. inclndhigbutnotlinu.edto,non-volatilemedia.vola.,leme^a, 

l™issioumeai.Non-vo«i.emediainc,udes,fcr example, optical,magneuc, or 

magne.o-op.icaldisks.suchass«^edevice410. Volatile media includes dynamic 
m^oiy, suchasmainmemory406. Transunssion media mdudes coaxial cables, copper 

memoiy,^ n«rr,T,ri<5ehus 402 Transmission media can 

wire and fiber optics, including the wires that compnse bus 40.. ir 

also talce the form of acoustic or Ught waves, such as those generateddurmgradi^^^^^ 
and infta-red data communications. 

,0089, Common forms of compu,er-r«d*le media include, for example, a floppy 
disk aflexibledisk.harddiak.magnetictape,oranyotemagneUcmedium.aCD- 
aianyother optica, medium. punchc^,pap^«. -mother physicalmediumwi^ 
pirns of holes, al^, aPKOM. andEPROM, aFlASH-EPROM. any oth^memory 
chip or cartridge, a carrier wave as described h«inafter, or any othermedium ftom 

which a computer can read. . 

[0090, various foimsofcomputerre^lememamaybeinvolvedmcanymgone or 

Lre sequences ofoneormoreinstructionstoprocessor 404 fbr execution For exam^^^^ 
^ instrucdons may initiaUy be carried on ama^etic disk of a remote compter. Tire 
^ote computer can load the instructions into its dynamic memory and send to 
tas^ucUons over a telephone luie using a modem. A modem local to computer system 
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400 can receive the data on the telephone line and use an infra-red transmitter to convert 
the data to an infra-red signal. An infra-red detector can receive the data carried in the 
infra-red signal and appropriate circuitry can place the data on bus 402, Bus 402 carries 
the data to main memory 406, from which processor 404 retrieves and executes the 
instructions. The instructions received by main memory 406 may optionally be stored on 
storage device 410 either before or after execution by processor 404. 
[0091] Computer system 400 also includes a communication interface 418 coupled to 
bus 402. Commtmication interface 418 provides a two-way data communication coupling 
to a network link 420 that is connected to a local network 422. For example, 
communication interface 418 maybe an integrated services digital network (ISDN) card 
or a modem to provide a data communication connection to a corresponding type of 
telephone line. As another example, communication interface 418 may be a local area 
network (LAN) card to provide a data communication connection to a compatible LAN. 
Wireless links may also be implemented. In any such implementation, communication 
interface 418 sends and receives electrical, electromagnetic or optical signals that carry 
digital data streams representing various types of information. 

[0092] Networic link 420 typically provides data communication through one or more 
networks to other data devices. For example, network link 420 may provide a connection 
through local network 422 to a host computer 424 or to data equipment operated by an 
Internet Service Provider (ISP) 426. ISP 426 in turn provides data communication 
services tbrou^ the world wide packet data communication network now commonly 
referred to as the "Internet" 428. Local network 422 and Internet 428 both use electrical, 
electromagnetic or optical signals that carry digital data streams. The signals through the 
various networks and the signals on network link 420 and through communication 
interface 418, which carry the digital data to and from computer system 400, are 
exemplary forms of carrier waves transporting the information. 

[0093] Computer system 400 can send messages and receive data, including program 
code, through the netsvork(s), network link 420 and coramimication interface 418. In the 
Internet example, a server 430 might transmit a requested code for an application program 
through iQtemet 428, ISP 426, local network 422 and communication interface 418. 
[0094] The received code may be executed by processor 404 as it is received, and/or 
stored in storage device 410, or other non-volatile storage for later execution. In this 
manner, computer s>^tem 400 may obtain application code in the form of a carrier wave. 
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EXTENSIONS AND ALTERNATIVES 
(OOW, Alt^nativc »boto«..s of *e invention described yf^'^' 
L Jing description, and in location, .hat best fi^Utate «d»g .he co„te« f me 
I^dimea.s.Fur*ennore.meinven.ionhas been described wim-^erencetospe.^^^^^ 

^bcmade.he^.0 wid«>ntd=partingfion.thebxoaderspm. and scope offte—n. 
FoLan^ie. i.nplemen.adonswerepres«.edin which SQLis used; however. *e 
,^,„esd=soribedhereinareno.linn.ed.ousewimSQL.foro*erda.a,ue,y 
languages n>ay be appUcabl. For «>o«»r exan^le. inaple=aen.ations were presenWm 

fcc con«x, of a subscriber/publisher sys«n; however, advantages and use of 
e„bcdimen.s of the tavention are no. limi«d «, Ms con«xt For one more ex^ple, 
taplementations werepres«.«dinwhich aconcate^binnap inde. . «ated on 
coL. of a predica«.able. However, embodin^ents are no. hnatted to su* an md^x, for 
o,„cr ,ndcx .>^es .ha. are sui«,le for indexhrg multiple col«.nns of dam.^l=s are also 
applicaWc^ Therefore, *e speciflcadon and drawing, are. accordingly, .o he regarded n> 
an illustrative rather than a restrictive sense. 

,00.6, h,add,„on,inthisdescriptioncer.ainprocessstepsarese.forarn>apart,cular 

crder and alphabetic and alphanumeric labels maybe ns=d«, identic oeriain s«ps. 
Uriels speciGcally stated intedescriptton, embodiments ofteinv«a..»narenot 

necessarily hmi.ed .o any parrieular ord« of carrying out such steps, to parbcular Are 
labds are used merely tor convenient identification of steps, and are not mtended to 
specify or require a particular order of carrying out such steps. 
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CLAIMS 

What is claimed is: 

1 . A method for managing expressions in a database, the method comprising the 
steps of: 

receiving a first query that includes a first conditional expression; 
representing the first conditional expression as data in a column of a table; 
receiving a second query that specifies a first set of one or more criteria; and 
executing the second query to select data based at least on whether conditional 
expressions in the column satisfy the fiurst set of criteria. 

2. The method of Claim 1, wherein the second query fiirther specifies a second set of 
one or more criteria, and wherein the step of executing the second query includes 
selecting data based on whether data in one or more columns other than the 
column satisfies the second set of criteria. 

3. The method of Claim 1, wherein the step of receiving a first query includes 
receiving tiie fibrst query from a first user of a database appUcation, and wherein 
the first conditional expression specifies one or more criteria that must be satisfied 
by data for the data to be of interest to the user. 

4. The method of Claim 3, wherein the step of receiving the second querj' includes 
receiving the second query fi-om a second user of a database application, and 
wherein the second query specifies a second set of one or more criteria that must 
be satisfied b}^ data for the data to be made available to the first user, thus 
enabling mutual filtering among the first user and the second user. 

5. The method of Claim 1, wherein the step of receiving a first query includes 
receiving the first query from a subscriber to an information subscription system, 
and wherein the first conditional expression specifies one or more criteria that 
must be satisfied by data for the data to be of interest to the subscriber. 

6. The method of Claim 5, wherein the step of receiving the second query includes 
receiving the second query fi-om a publisher in the subscription system, and 
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M ^= second query specifies . secoBd se. of one or more criteria flrat ^ 
be satisfied by data for the data to be pubSshed to the subscnber. 

, Tteme.hodofaaiml,whereirrthestepofre=eiviBg.heseooad,uerytooludes 
■ ^ivingtoseoondWftomapublisheriBanirrformatioBSubscnpUon 



more 



9. 



10. 



system, and wherein the second query specifies a second set of one or 
L^thatmustbesatisfiedbydataforthedatatobepubUshedtoasubscrxberto 

the subscription system. 

■ttemethod of Clahn I, wherein the step of representing a first conditional 
expressionasdatainacolutnninolu^n^-eatmgaset of first condruon.1 

e^ressio,^ fiom a phmOity of first queries as data in the cotanm, the method 

further comprising the steps of: 
creatinganindexonthecolunmofthetableby 

groupingpredicates from the set offirst conditional expressions based on predicate 

identifiers that axe associated with respective criteria of the first set of enter.; 
storing in a predicate table sets of predicate operators and constants m assocxahon 
with respective predicate identifiers; 

creating a concatenated bitmap index based on a set of predicate operators and 
constants associated with a particular predicate identifier. 

A computer-readable medium carrying one or more sequences of instructions 
which, when executed by one or more processors, causes the one of more 
processors to perform the any one of the methods recited in Clamxs 1-8. 

Amethod for evaluating an expression set that is stored as data inacolumnofa 
table, the method comprising the steps of: 

classifying each predicate fiom each expression of the expression set as one of, 

an indexed attribute predicate, wherein abitm^ index is created based on 
a set of operator and constant attributes that are stored in respective colmmis of a 
predicate table in association with a respective predicate identifier; 

a stored attribute predicate, wherein a set of operator and constant 
attributes are stored in respective columns of the predicate table in association 
witharespectivepredicateidentifier, and wherein no index is created on the set of 

Operator and constant attributes; 
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a sparse predicate, in which a predicate identij5er associated with a sparse 
predicate is uncommon in the expression set, and wherein a sparse predicate is 
stored as data in the predicate table; and 

filtering the expression set based on the classification of predicates firom the 
expression set.. 

1 1 . The method of Claim 1 0, wherein the step of filtering the expression set includes 
the steps of: 

first, filtering the indexed attribute predicates to obtain a first set of expressions 
that includes one or more expressions for which all of its indexed attribute 
predicates are trae for a given data item; 

second, filtering the stored attribute predicates of the first set of expressions to 
obtain a second set of expressions that includes one or more expressions for which 
all of its stored attributes are tme for the given data item; and 
third, filtering the sparse predicates, if any, of Ihe second set of expressions to 
obtain a third set of expressions that includes one or more expressions for which 
all of its predicates are true for the given data item. 

1 2. The method of Claim 1 1 , fiirfher comprising the step of: 

publishing the given data item to a subscriber to an information subscription 
system that has expressed interest in data that meets criteria represented by an 
expression from the third set of expressions. 

13. A computer-readable medium carrying one or more sequences of instructions 
which, when executed by one or more processors, causes the one or more 
processors to perform the any one of the methods recited in Claims 10-12, 

14. A computer-readable medium storing an index that is built on a table that has a 
plvirality of rows, the index comprising: 

concatenated bit vectors, wherein each bit vector is associated with one of an 
operator and a constant that are each stored in respective colimms of the table; 
wherein the operator and the constant are associated with a predicate of one or 
more predicates fi-om a conditional expression that specifies one or more criteria 
that must be satisfied by data for the data to be of interest to a subscriber to a 
subscription system. 
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1 5 An apparatus for managing expressions in a database, comprxsmg: 

means for receiving a first query that includes a first conditional expression; 
means for representing the first conditional expression as data in a colmnn of a 
table; 

means for receiving a second query that specifies a first set of one or more 
criteria; and 

means for executing the second query to select data based at least on whether 
conditional expressions in the column satisfy the first set of criteria. 

16. An apparatus for evaluating an expression set that is stored as data in a colmnn of 
a table, comprising: 

means for classifying each predicate from each expression of the expression set as 
one of, 

an indexed attribute predicate, wherein abitmap index is created based on 
a set of operator and constant attributes that are stored in respective colmmis of a 
predicate table in association with a respective predicate identifier; 

a stored attribute predicate, wherein a set of operator and constant 
attributes are stored in respective columns of lixe predicate table in association 
with a respective predicate identifier, and wherein no index is created on the set of 
Operator and constant attributes; 

a sparse predicate, in which a predicate identifier associated with a sparse 
predicate is uncommon in tiie expression set, and wherein a sparse predicate is 
stored as data in the predicate table; and 

means for filtering the expression set based on tiie classification of predicates from 
the expression set. 
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